ABSTRACT

Differential game theory and dynamic programming are applied
to derive the optimal strategy, or sequence of controls, for a class
of linear, sampled-data, pursuit-evasion problems. The necessary
and sufficient conditions for existence of the solution are derived.
A digital computer program for simulating control generation and
system trajectories is given. The results of simulation tests using this
digital computer model are presented to demonstrate the salient
characteristics of the strategy.
1. INTRODUCTION

Recent literature in the field of automatic control theory shows an
increasing interest in the application of differential game theory to
control system design [1, 2, 3]. This approach to control system
design may be considered an extension of "optimal" control theory
allowing one or more opposing systems controlled by intelligent
opponents to be introduced into the design computations.
("Intelligent" is used here to mean having knowledge of the opponent's
system dynamics, states, and desires.) Further, this approach assumes
that the controller of each system will act in accordance with his
intelligence. This allows system design to provide satisfactory
performance when an opponent acts to minimize that performance, and to
provide improved performance when the opponent acts otherwise.

The disadvantages of this approach to control system design are
generally those attributed to optimal control theory in the past:
the difficulty of expressing realistic performance objectives in the
mathematical form of a performance index, and the difficulty of
obtaining solutions to problems with realistic physical constraints.
These difficulties are somewhat increased by the introduction of
intelligent opponents.


2. STATEMENT OF THE PROBLEM 
This research was confined to exploring the solution and salient
characteristics of the sampled-data analog of a continuous-system
pursuit-evasion problem and solution published by Y.C. Ho, A.E.
Bryson, and S. Baron [3].

2.1 Differential Constraints

The pursuit and evasion plants are described by a set of linear,
time-invariant, first-order state equations [4]:


$$\dot{x}_p = A_p x_p + B_p u \tag{1}$$

$$\dot{x}_e = A_e x_e + B_e v \tag{2}$$


The following definitions apply to equations (1) and (2):

    $x_p$ is the (n by 1) state vector of the pursuit plant
    $x_e$ is the (n by 1) state vector of the evasion plant
    $u$ is the ($m_p$ by 1) control vector of the pursuit plant
    $v$ is the ($m_e$ by 1) control vector of the evasion plant
    $A_p$ and $A_e$ are (n by n) matrices of constant coefficients
    $B_p$ is an (n by $m_p$) matrix of constant coefficients
    $B_e$ is an (n by $m_e$) matrix of constant coefficients.

Since the controls u and v are to be applied in sampled-data form,
it is convenient to describe the plants by the following difference
constraint equations [4]:


$$x_p(k+1) = \Phi_p(T)\,x_p(k) + \Delta_p(T)\,u(k) \tag{3}$$

$$x_e(k+1) = \Phi_e(T)\,x_e(k) + \Delta_e(T)\,v(k) \tag{4}$$


where $\Phi_p$ and $\Phi_e$ are state transition matrices defined by:

$$\Phi_p(T) = e^{A_p T} \tag{5}$$

$$\Phi_e(T) = e^{A_e T} \tag{6}$$

where,

$$e^{AT} = I + AT + \frac{A^2T^2}{2!} + \frac{A^3T^3}{3!} + \cdots \tag{7}$$

The state distribution matrices are defined:

$$\Delta_p(T) = \int_0^T \Phi_p(T-t)\,B_p\,dt \tag{8}$$

$$\Delta_e(T) = \int_0^T \Phi_e(T-t)\,B_e\,dt \tag{9}$$


The integrals (8) and (9) can be evaluated by the infinite series:

$$\Delta(T) = \left[ IT + \frac{AT^2}{2!} + \frac{A^2T^3}{3!} + \frac{A^3T^4}{4!} + \cdots \right] B \tag{10}$$
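
The truncated-series evaluation of equations (7) and (10) can be
sketched in a few lines. The fragment below is an illustration in
Python with NumPy, not the FORTRAN subroutine of the thesis; the
function name `phi_delta` and the default of 20 series terms are
assumptions made here.

```python
import numpy as np


def phi_delta(A, B, T, terms=20):
    """Truncated series (7) and (10) for one plant.

    Returns (Phi, Delta) for sample period T. `terms` is an assumed
    truncation length, not a value taken from the thesis.
    """
    A = np.asarray(A, float)
    B = np.asarray(B, float)
    n = A.shape[0]
    phi = np.zeros((n, n))
    integral = np.zeros((n, n))
    term = np.eye(n)  # holds A^k T^k / k!, starting with k = 0
    for k in range(terms):
        phi += term                          # series (7)
        integral += term * T / (k + 1)       # A^k T^(k+1) / (k+1)!, series (10)
        term = term @ A * T / (k + 1)        # advance to A^(k+1) T^(k+1) / (k+1)!
    return phi, integral @ B
```

For an undamped unit mass (a double integrator) the series terminates
after two terms, which makes a convenient check of the routine.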
2.2 Performance Index 


The performance index is defined: 


$$J = [x_p(t_f) - x_e(t_f)]^T Q\,[x_p(t_f) - x_e(t_f)] + \int_0^{t_f} [u^T(t) R_p u(t) - v^T(t) R_e v(t)]\,dt \tag{11}$$

The following additional definitions apply to the performance index:

    $Q$ is a (n by n) positive semi-definite, symmetric, final-
    state-difference weighting matrix.
    $R_p$ is a ($m_p$ by $m_p$) positive definite, symmetric, pursuit
    control weighting matrix.
    $R_e$ is a ($m_e$ by $m_e$) positive definite, symmetric, evasion
    control weighting matrix.
Since the controls are to be applied in sampled-data form
(u and v constant for the sample period T) for N sample periods
($t_f = NT$), the performance index given in (11) is represented by:

$$J = [x_p(N) - x_e(N)]^T Q\,[x_p(N) - x_e(N)] + T \sum_{k=0}^{N-1} [u^T(k) R_p(k) u(k) - v^T(k) R_e(k) v(k)] \tag{12}$$


The desired optimal solution is a set of controls:

$$u^o = \{u(0), u(1), \ldots, u(N-1)\}$$ to minimize J,

and

$$v^o = \{v(0), v(1), \ldots, v(N-1)\}$$ to maximize J.

Equivalently, the saddle point

$$J(u^o, v) \le J(u^o, v^o) \le J(u, v^o) \tag{13}$$

is sought.


It is assumed at this point that such a saddle point does exist
and that the optimal value of the performance index is:

$$J^o = J(u^o, v^o) = \min_u \max_v J = \max_v \min_u J \tag{14}$$


The dominant characteristics of this performance index are as
follows:

1) The form of the performance index requires that the orders of
the pursuit and the evasion plants be the same (otherwise $x_p$ and $x_e$
would not be conformable for subtraction). The dimensions of the
pursuit and evasion control vectors, however, need not be the same
($m_p$ need not equal $m_e$).

2) The final state difference weighting matrix Q is chosen
positive semi-definite in order to reflect an increase in the performance
index for an increase in the difference, at the final time, of only the
states of interest of the two plants.

3) The $R_p$ matrix is chosen positive definite in order to reflect
an increase in the performance index for the use of any pursuit control.

4) The $R_e$ matrix is chosen positive definite in order to reflect
a decrease in the performance index for the use of any evasion control.

5) The matrices Q, $R_p$, and $R_e$ are chosen symmetric since this
choice involves no loss of generality in the quadratic form but makes
subsequent statements about the solution simpler.
3. DERIVATION OF THE OPTIMAL STRATEGY

The derivation given here was first done by Dr. Donald E. Kirk, 
Assistant Professor of Electrical Engineering at the Naval Postgraduate 
School. It has been expanded slightly to include a set of sufficiency 
conditions and to make its application as general as possible. 
3.1 Minimaximization of the Performance Index 

The optimal controls are found by applying the dynamic
programming approach of Bellman [5]. The "cost" of operation during
the final (Nth) stage of operation is given by

$$J_1 = [x_p(N) - x_e(N)]^T Q\,[x_p(N) - x_e(N)] + u^T(N-1) R_p(N-1) u(N-1) - v^T(N-1) R_e(N-1) v(N-1). \tag{15}$$


The effect of the sample period in equation (12) can be accounted
for by multiplying each element in the $R_p$ and $R_e$ matrices by T. T is
omitted for convenience at this point, to be reintroduced later in the
derivation. Since $x_p(N)$ and $x_e(N)$ depend only on $x_p(N-1)$, $x_e(N-1)$,
u(N-1), and v(N-1), substituting equations (3) and (4) into equation
(15) and neglecting the indices, which are all (N-1), yields

$$J_1 = [\Phi_p x_p + \Delta_p u - \Phi_e x_e - \Delta_e v]^T Q\,[\Phi_p x_p + \Delta_p u - \Phi_e x_e - \Delta_e v] + u^T R_p u - v^T R_e v. \tag{16}$$
It is convenient to define:

$$z(N-1) = \Phi_p x_p(N-1) - \Phi_e x_e(N-1). \tag{17}$$

Note that the z(N-1) vector is the projected difference in states at
the final time if no control is applied during the final (Nth) stage of
operation.

Using equation (17) in (16) and defining P(0) = Q:

$$J_1 = [z + \Delta_p u - \Delta_e v]^T P(0)\,[z + \Delta_p u - \Delta_e v] + u^T R_p u - v^T R_e v. \tag{18}$$


Defining $dJ_1(du)$ and $dJ_1(dv)$ as the increments of $J_1$ from its optimal
value caused by small variations du(N-1) and dv(N-1), respectively,
from the optimal values $u^o(N-1)$ and $v^o(N-1)$ gives:

$$dJ_1(du) = 2\{[\Delta_p^T P(0)\Delta_p + R_p]u^o - \Delta_p^T P(0)\Delta_e v^o + \Delta_p^T P(0) z\}^T du + du^T[\Delta_p^T P(0)\Delta_p + R_p]\,du \tag{19a}$$

$$dJ_1(dv) = 2\{[\Delta_e^T P(0)\Delta_e - R_e]v^o - \Delta_e^T P(0)\Delta_p u^o - \Delta_e^T P(0) z\}^T dv + dv^T[\Delta_e^T P(0)\Delta_e - R_e]\,dv. \tag{19b}$$


Inspection shows that the necessary conditions for the desired
saddle point are that the first order terms in du and dv in equations
(19) must be zero. Then:

$$[\Delta_p^T P(0)\Delta_p + R_p]u^o - \Delta_p^T P(0)\Delta_e v^o + \Delta_p^T P(0) z = 0 \tag{20}$$

$$-\Delta_e^T P(0)\Delta_p u^o + [\Delta_e^T P(0)\Delta_e - R_e]v^o - \Delta_e^T P(0) z = 0. \tag{21}$$


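Assuming that the inverse required below exists, and solving for
$u^o$ and $v^o$ gives:

$$\begin{bmatrix} u^o \\ v^o \end{bmatrix} = \begin{bmatrix} \Delta_p^T P(0)\Delta_p + R_p & -\Delta_p^T P(0)\Delta_e \\ -\Delta_e^T P(0)\Delta_p & \Delta_e^T P(0)\Delta_e - R_e \end{bmatrix}^{-1} \begin{bmatrix} -\Delta_p^T P(0) \\ \Delta_e^T P(0) \end{bmatrix} z \tag{22}$$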
It is convenient to rewrite equation (22) in the form:

$$\begin{bmatrix} u^o(N-1) \\ v^o(N-1) \end{bmatrix} = \begin{bmatrix} F_p(1) \\ F_e(1) \end{bmatrix} z(N-1) \tag{23}$$


Substitution of the above expression for the optimal controls $u^o(N-1)$
and $v^o(N-1)$ into equation (18) yields the optimal "cost" of operation
over the last (Nth) stage of operation:

$$J_1^o[z(N-1)] = [z + \Delta_p F_p z - \Delta_e F_e z]^T P(0)\,[z + \Delta_p F_p z - \Delta_e F_e z] + z^T F_p^T R_p F_p z - z^T F_e^T R_e F_e z. \tag{24}$$

Now defining

$$\psi(1) = I + \Delta_p F_p(1) - \Delta_e F_e(1), \tag{25}$$


equation (24) can be rewritten:

$$J_1^o[z(N-1)] = z^T \psi^T(1) P(0) \psi(1) z + z^T F_p^T(1) R_p(N-1) F_p(1) z - z^T F_e^T(1) R_e(N-1) F_e(1) z. \tag{26}$$

Now defining

$$P(i) = \psi^T(i) P(i-1) \psi(i) + F_p^T(i) R_p(N-i) F_p(i) - F_e^T(i) R_e(N-i) F_e(i), \tag{27}$$

equation (26) can be rewritten:

$$J_1^o[z(N-1)] = z^T(N-1) P(1) z(N-1). \tag{28}$$
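
As a quick numerical check of the last-stage result, consider a
hypothetical scalar case. The values are chosen here for illustration
only and are not taken from the thesis: both transition matrices and
both distribution matrices equal 1, P(0) = Q = 1, the pursuit control
weight is 1, and the evasion control weight is 2. The gain equation,
the definition of $\psi(1)$, and the definition of P(1) then give

$$\begin{bmatrix} F_p(1) \\ F_e(1) \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -1 & -1 \end{bmatrix}^{-1} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -2/3 \\ -1/3 \end{bmatrix}$$

$$\psi(1) = 1 + (-2/3) - (-1/3) = 2/3$$

$$P(1) = (2/3)^2 (1) + (-2/3)^2 (1) - (-1/3)^2 (2) = 2/3$$

so the optimal last-stage cost is two-thirds of the squared projected
miss. Both gains have the same sign, and the evader's gain is the
smaller in magnitude.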


The dynamic programming process is continued by considering
the cost of operation over the last two stages of operation.

$$J_2 = u^T(N-2) R_p(N-2) u(N-2) - v^T(N-2) R_e(N-2) v(N-2) + J_1[z(N-1)] \tag{29}$$

$$J_2^o = \min_{u(N-2),\,u(N-1)} \; \max_{v(N-2),\,v(N-1)} [J_2] \tag{30}$$

Since u(N-1) and v(N-1) affect only the $J_1$ portion of equation (29):

$$J_2^o = \min_{u(N-2)} \max_{v(N-2)} \Big[ u^T(N-2) R_p(N-2) u(N-2) - v^T(N-2) R_e(N-2) v(N-2) + \min_{u(N-1)} \max_{v(N-1)} J_1 \Big] \tag{31}$$
Equation (31) can be rewritten using the definition of $J^o$ in
equation (14):

$$J_2^o = \min_{u(N-2)} \max_{v(N-2)} \big[ u^T(N-2) R_p(N-2) u(N-2) - v^T(N-2) R_e(N-2) v(N-2) + J_1^o \big] \tag{32}$$

Substituting equation (28) into (32) gives:

$$J_2^o = \min_{u(N-2)} \max_{v(N-2)} \big[ u^T(N-2) R_p(N-2) u(N-2) - v^T(N-2) R_e(N-2) v(N-2) + z^T(N-1) P(1) z(N-1) \big] \tag{33}$$


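Relating z(N-1) to $x_p(N-2)$, $x_e(N-2)$, u(N-2), and v(N-2) yields:

$$z(N-1) = \Phi_p x_p(N-1) - \Phi_e x_e(N-1) \tag{34}$$

$$z(N-1) = \Phi_p[\Phi_p x_p(N-2) + \Delta_p u(N-2)] - \Phi_e[\Phi_e x_e(N-2) + \Delta_e v(N-2)] \tag{35}$$

$$z(N-1) = \Phi_p^2 x_p(N-2) - \Phi_e^2 x_e(N-2) + \Phi_p \Delta_p u(N-2) - \Phi_e \Delta_e v(N-2). \tag{36}$$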


Extending the definition of z, define:

$$z(N-i) = \Phi_p^i x_p(N-i) - \Phi_e^i x_e(N-i). \tag{37}$$

It is noted that z(N-i) is the projected "miss distance" if no control
is applied to either the pursuer or evader after the (N-i)th stage of
operation.

It is also convenient to define:

$$\Delta_p(i) = \Phi_p^i \Delta_p \tag{38}$$

$$\Delta_e(i) = \Phi_e^i \Delta_e. \tag{39}$$
Substituting equations (37), (38), and (39) into equation (36) gives:

$$z(N-1) = z(N-2) + \Delta_p(1) u(N-2) - \Delta_e(1) v(N-2). \tag{40}$$


Substituting equation (40) into equation (33), and omitting all (N-2)
subscripts gives:

$$J_2^o[z(N-2)] = \min_{u(N-2)} \max_{v(N-2)} \big\{ u^T R_p u - v^T R_e v + [z + \Delta_p(1) u - \Delta_e(1) v]^T P(1)\,[z + \Delta_p(1) u - \Delta_e(1) v] \big\}. \tag{41}$$


Comparison of equation (41) and equation (18) shows them to be
identical in form. It is apparent that the solution shown in equation
(23) for the last stage of operation can be generalized for all stages
of operation:

$$\begin{bmatrix} u^o(N-i) \\ v^o(N-i) \end{bmatrix} = \begin{bmatrix} F_p(i) \\ F_e(i) \end{bmatrix} z(N-i) = F(i)\,z(N-i). \tag{42}$$


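The recursive relationships necessary to compute the F(i) matrices
for all sample periods are as follows:

$$P(0) = Q \tag{43a}$$

$$\Delta_p(0) = \Delta_p \tag{43b}$$

$$\Delta_e(0) = \Delta_e \tag{43c}$$

$$F(i) = \begin{bmatrix} F_p(i) \\ F_e(i) \end{bmatrix} = \begin{bmatrix} \Delta_p^T(i-1) P(i-1) \Delta_p(i-1) + R_p(N-i) & -\Delta_p^T(i-1) P(i-1) \Delta_e(i-1) \\ -\Delta_e^T(i-1) P(i-1) \Delta_p(i-1) & \Delta_e^T(i-1) P(i-1) \Delta_e(i-1) - R_e(N-i) \end{bmatrix}^{-1} \begin{bmatrix} -\Delta_p^T(i-1) P(i-1) \\ \Delta_e^T(i-1) P(i-1) \end{bmatrix} \tag{43d}$$

$$\psi(i) = I + \Delta_p(i-1) F_p(i) - \Delta_e(i-1) F_e(i) \tag{43e}$$

$$P(i) = \psi^T(i) P(i-1) \psi(i) + F_p^T(i) R_p(N-i) F_p(i) - F_e^T(i) R_e(N-i) F_e(i) \tag{43f}$$

$$\Delta_p(i) = \Phi_p \Delta_p(i-1) \tag{43g}$$

$$\Delta_e(i) = \Phi_e \Delta_e(i-1). \tag{43h}$$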


The relationships necessary to generate the optimal controls are:

$$z(N-i) = \Phi_p^i x_p(N-i) - \Phi_e^i x_e(N-i) \tag{44a}$$

$$\begin{bmatrix} u^o(N-i) \\ v^o(N-i) \end{bmatrix} = \begin{bmatrix} F_p(i) \\ F_e(i) \end{bmatrix} z(N-i) \tag{44b}$$

The effect of the sample period on the solution, which was
neglected after equation (12), can now be readily accounted for by
multiplying each element of the $R_p$ and $R_e$ matrices by the sample
period.
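
The backward recursion for the gain matrices can be sketched as
follows. This is an illustrative Python/NumPy fragment, not the
program of the thesis. It assumes constant control weighting matrices
into which the sample period has already been multiplied, and the
function name `saddle_point_gains` and its argument names are
hypothetical.

```python
import numpy as np


def saddle_point_gains(phi_p, phi_e, delta_p, delta_e, Q, R_p, R_e, N):
    """Backward recursion for the gains, assuming constant R_p and R_e.

    Returns (F_p, F_e, P): F_p[i - 1] and F_e[i - 1] hold the gains for
    recursion index i = 1, ..., N, and P[i] holds P(i) with P[0] = Q.
    """
    phi_p, phi_e = np.asarray(phi_p, float), np.asarray(phi_e, float)
    dp, de = np.asarray(delta_p, float), np.asarray(delta_e, float)
    R_p, R_e = np.asarray(R_p, float), np.asarray(R_e, float)
    P = [np.asarray(Q, float)]          # P(0) = Q
    m_p = dp.shape[1]
    F_p, F_e = [], []
    for _ in range(N):
        P_prev = P[-1]
        # Partitioned matrix that must be inverted at each stage.
        D = np.block([
            [dp.T @ P_prev @ dp + R_p, -dp.T @ P_prev @ de],
            [-de.T @ P_prev @ dp, de.T @ P_prev @ de - R_e],
        ])
        rhs = np.vstack([-dp.T @ P_prev, de.T @ P_prev])
        F = np.linalg.solve(D, rhs)     # raises LinAlgError if D is singular
        Fp_i, Fe_i = F[:m_p], F[m_p:]
        psi = np.eye(P_prev.shape[0]) + dp @ Fp_i - de @ Fe_i
        P.append(psi.T @ P_prev @ psi
                 + Fp_i.T @ R_p @ Fp_i - Fe_i.T @ R_e @ Fe_i)
        F_p.append(Fp_i)
        F_e.append(Fe_i)
        dp, de = phi_p @ dp, phi_e @ de  # propagate the distribution matrices
    return F_p, F_e, P
```

Solving the linear system directly, rather than forming the inverse,
is a design choice of this sketch; the gains obtained are the same.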

3.2 Sufficiency Conditions 

It has been assumed thus far that the saddle point described in 
equations (13) and (14) exists. Examination of equations (19) shows 
that such a saddle point is indeed reached by applying the control 
law described by equations (43) and (44) if the following conditions 
are met: 

a) the first order terms in du and dv in equations (19) must be 
identically zero, 

b) the second order term in du in equation (19a) must be 
positive definite, and 

c) the second order term in dv in equation (19b) must be
negative definite.

It is convenient here to identify the matrix which requires inversion in
equation (43d) as the matrix D, and the sub-matrices into which it is
partitioned as follows:

$$D = \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix} \tag{45}$$


An equivalent set of sufficient conditions can now be stated as
follows:

a) the inverse of the D matrix must exist,

b) the matrix $D_{11}$ must be positive definite, and

c) the matrix $D_{22}$ must be negative definite.


A moment's examination of equation (43d) reveals that the D
matrix is symmetric. It can be shown that the inverse of the symmetric
matrix D exists if the inverse of $D_{11}$ and the inverse of C exist [6],
where:

$$C = D_{22} - D_{21} D_{11}^{-1} D_{12} \tag{46}$$


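When $D_{11}$ is positive definite and $D_{22}$ is negative definite, the
matrix of equation (46) is the sum of a negative definite matrix and a
negative semi-definite matrix (the off-diagonal blocks of the symmetric
matrix D are transposes of each other), so it is negative definite and
the required inverses exist. Therefore, the necessary and sufficient
conditions to insure the required saddle point are:

a) $\Delta_p^T(i-1) P(i-1) \Delta_p(i-1) + R_p(N-i)$ must be positive definite
for all i from one to N, and

b) $\Delta_e^T(i-1) P(i-1) \Delta_e(i-1) - R_e(N-i)$ must be negative definite
for all i from one to N.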
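
A test of these two definiteness conditions at a single stage might
be written as below. This is a sketch under stated assumptions: the
function name `saddle_point_test` is hypothetical, the eigenvalue test
is one of several possible definiteness checks, and nothing here is
taken from the thesis program.

```python
import numpy as np


def saddle_point_test(delta_p, delta_e, P, R_p, R_e):
    """Return True if the two diagonal blocks satisfy the conditions.

    delta_p, delta_e and P are the stage quantities with index (i - 1);
    R_p and R_e are the weighting matrices for that stage.
    """
    dp, de = np.asarray(delta_p, float), np.asarray(delta_e, float)
    P = np.asarray(P, float)
    D11 = dp.T @ P @ dp + np.asarray(R_p, float)
    D22 = de.T @ P @ de - np.asarray(R_e, float)
    # eigvalsh expects a symmetric matrix; symmetrize to suppress round-off.
    eig11 = np.linalg.eigvalsh((D11 + D11.T) / 2)
    eig22 = np.linalg.eigvalsh((D22 + D22.T) / 2)
    return bool(eig11.min() > 0 and eig22.max() < 0)
```

Calling the function once per stage inside the gain recursion gives
the index of the first stage, if any, at which the saddle point fails
to exist.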




4. DIGITAL COMPUTER SIMULATION PROGRAM 

In order to simulate the results of applying the control strategy
derived in this thesis to various systems, the digital computer
program OPTIMAL2 was employed. This program was written in FORTRAN
63 computer language for use with a Control Data Corporation 1604
Computer. The complete program is given in Appendix A.

The program OPTIMAL2 was written to have as general an
application as possible within the restrictions imposed in the problem
statement and derivation. To this end, pursuit and evasion system
descriptions and the performance index to be minimized are read in as
data. (The pursuit and evasion systems are completely described by
the differential equation coefficient matrices $A_p$, $A_e$, $B_p$, $B_e$, and the
sample period T. The performance index to be minimized is completely
described by the weighting matrices Q, $R_p$, $R_e$, the sample period
T, and the number of samples N. The initial conditions $x_p(0)$ and
$x_e(0)$ are required for trajectory simulation only.) The program will
accept any system so described with state vector dimensions of ten
or less and with control vectors of dimensions such that the sum for
both plants does not exceed ten. Printout of input data is provided
for verification purposes.

The program uses subroutine PHIDEL to compute the matrices $\Phi_p$,
$\Phi_e$, $\Delta_p$, and $\Delta_e$ for the systems described by the input data. This
computation is effected by programming truncated forms of the
infinite series given in equations (7) and (10). Printout of these
matrices is provided for reference purposes.

The program uses subroutine OPTCON2 to generate and record
on magnetic tape for later use the "feedback gain" matrices $F_p(i)$
and $F_e(i)$ and the state transition matrices $\Phi_p^i$ and $\Phi_e^i$. These
computations are effected by programming equations (43). Comment
statements in the subroutine indicate the appropriate location for
additional statements required to read from data or to generate a
"time-varying" form of the control weighting matrices $R_p(N-i)$ and
$R_e(N-i)$. (No correction is required if these matrices are constant.)
Provision is made for printout of a number of the feedback-gain matrices
and the state transition matrices, and for graphical output of the (1,1)
elements of the feedback-gain matrices to show trends in their variation.
Provision is also made for a printout indication if the sufficiency
conditions for the required saddle point are not met.

The program then applies the matrices generated by OPTCON2 to
compute the sequence of optimal controls by programming equations
(44), and generates a simulated trajectory for each system. The
value of the performance index is computed for these trajectories,
and is provided as a printed output. Comment statements near the
end of the program OPTIMAL2 indicate where changes of the output
statements may be required to present the trajectories of different
systems in usable form.
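
The trajectory simulation step can be sketched as a closed loop over
the difference equations (3) and (4) and the control law (44). The
fragment is illustrative Python/NumPy only; the function name
`simulate` and the convention that `gains_p[i - 1]` holds the pursuit
gain for recursion index i (and likewise `gains_e`) are assumptions of
this sketch, not features of OPTIMAL2.

```python
import numpy as np


def simulate(phi_p, phi_e, delta_p, delta_e, gains_p, gains_e, x_p0, x_e0):
    """Apply the stored gains over N = len(gains_p) sample periods."""
    phi_p, phi_e = np.asarray(phi_p, float), np.asarray(phi_e, float)
    delta_p, delta_e = np.asarray(delta_p, float), np.asarray(delta_e, float)
    x_p = np.asarray(x_p0, float)
    x_e = np.asarray(x_e0, float)
    N = len(gains_p)
    controls = []
    for k in range(N):
        i = N - k  # number of sample periods remaining
        # Projected terminal state difference if no further control is used.
        z = (np.linalg.matrix_power(phi_p, i) @ x_p
             - np.linalg.matrix_power(phi_e, i) @ x_e)
        u = np.asarray(gains_p[i - 1], float) @ z
        v = np.asarray(gains_e[i - 1], float) @ z
        controls.append((u, v))
        x_p = phi_p @ x_p + delta_p @ u
        x_e = phi_e @ x_e + delta_e @ v
    return x_p, x_e, controls
```

Because the gains depend only on the plants and the performance index,
they can be computed once and passed in, so the loop itself needs only
matrix-vector products.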

Figure 1 shows a vector flow-graph model for implementing the
optimal pursuit-evasion strategy. The optimal strategy is defined
(the feedback-gain matrices may be computed for the largest number
of sample periods likely to be required) when the dynamics of the
pursuit and evasion plants are known, and it may be recorded.
Therefore, "on-line" or "real-time" computation of the strategy is not
required.

Examples of the required form of input data and examples of
printed output are presented in Appendix B. Examples of variations
on the basic program for different systems are presented in Appendix
C.


[Figure 1. Vector Flowgraph Model. The pursuer and evader plants are
each driven through a feedback-gain matrix, $F_p(N-k)$ or $F_e(N-k)$,
acting on z(k).]


5. RESULTS OF COMPUTER SIMULATION TESTS

Using the computer simulation program described in the
preceding section, three series of tests were run to investigate different
characteristics of the optimal pursuit-evasion strategy. The results are
reported below.

The following definitions are used to simplify the recording of
data:

$$z(N) = x_p(N) - x_e(N), \tag{47}$$

$$\text{CostR} = T \sum_{k=0}^{N-1} [u^T(k) R_p u(k) - v^T(k) R_e v(k)], \tag{48}$$

$$\text{CostQ} = z^T(N)\,Q\,z(N), \text{ and} \tag{49}$$

$$J = \text{CostR} + \text{CostQ}. \tag{50}$$
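
These bookkeeping quantities are straightforward to compute from a
simulated trajectory. The following Python/NumPy fragment is a sketch
only; the function name `performance_index` is hypothetical, and the
default T = 1 assumes that the sample period has either been absorbed
into the control weighting matrices or is equal to one.

```python
import numpy as np


def performance_index(z_N, Q, R_p, R_e, u_seq, v_seq, T=1.0):
    """Return (CostR, CostQ, J) for constant weighting matrices.

    u_seq and v_seq are sequences of control vectors u(k) and v(k).
    """
    z_N = np.asarray(z_N, float)
    Q = np.asarray(Q, float)
    R_p, R_e = np.asarray(R_p, float), np.asarray(R_e, float)
    cost_q = float(z_N @ Q @ z_N)
    cost_r = 0.0
    for u, v in zip(u_seq, v_seq):
        u, v = np.asarray(u, float), np.asarray(v, float)
        # Pursuit control adds to the cost; evasion control subtracts from it.
        cost_r += float(u @ R_p @ u - v @ R_e @ v)
    cost_r *= T
    return cost_r, cost_q, cost_r + cost_q
```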


5.1 The Effect of Varying the Performance Index Parameters

Run 1 is a standard intercept to which subsequent runs are
generally compared. The pursuit and evasion plants are identical,
undamped, spherical, unit masses traveling in a two-dimensional
(north-east) space. The state vectors and control vectors for these
plants are defined:

$$x = \begin{bmatrix} \text{north position} \\ \text{north velocity} \\ \text{east position} \\ \text{east velocity} \end{bmatrix}, \qquad u = \begin{bmatrix} \text{north acceleration} \\ \text{east acceleration} \end{bmatrix} \tag{51}$$

The computer input data and terminal performance data are given
below. The resulting trajectories and the time variation of the (1,1)
elements of the feedback-gain matrices are shown in Figures 2 and 3.
A complete set of input data showing the required format and a
complete computer printout of the numerical results for this run are given
in Appendix B.


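Input Data

$$A_p = A_e = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad B_p = B_e = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix},$$

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad R_p = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad R_e = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix},$$

$$x_p(0) = \begin{bmatrix} 0 \\ 5 \\ 0 \\ 5 \end{bmatrix}, \qquad x_e(0) = \begin{bmatrix} 500 \\ -10 \\ 0 \\ 10 \end{bmatrix}, \qquad N = 20, \qquad T = 1.$$

Terminal Performance Data

    z(N) = [ -.0938, 30.0020, -.0469, (illegible) ]^T
    CostR = 23.4380
    CostQ = (illegible)
    J     = 23.441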


The use of this particular semi-definite Q matrix defines position
only as the "states-of-interest" for the performance index. The effect
of this choice is readily apparent in the resulting relative difference
in the magnitudes of the velocity and position elements of the state
difference vector at the final time, z(N), and in the "intercept"
trajectories in Figure 2.

Examination of the series of feedback-gain matrices (see
Appendix B), which define the optimal strategy for this run, shows
that the required acceleration for the pursuer is always parallel to the
projected terminal "miss-distance" vector, so as to provide maximum
reduction in the projected miss-distance for a given acceleration
magnitude. The required acceleration for the evader is in the same
direction (and smaller in magnitude). This situation is peculiar to
the special case where both systems are point masses, the controls
are accelerations, and the final state weighting matrix Q indicates
interest only in position error at the final time. This physical
interpretation of the strategy, previously stated only in mathematical
form, lends an intuitive appeal.

[Figure 2. Standard Intercept Trajectories. X-scale and Y-scale:
1.00E+02 units/inch.]

[Figure 3. Time Variation of the (1,1) Elements of the Feedback-gain
Matrices for the Standard Intercept.]
Run 2 shows the effect of increasing the time for pursuit. The
input data and the terminal performance data are given below. The
resulting trajectories are shown in Figure 4.

Input Data

Identical to Run 1 except: N = 25.

Terminal Performance Data

    z(N) = [ (illegible), 22.501, (illegible), 2.500 ]^T
    CostR = 7.4994
    CostQ = .0018
    J     = 7.5012

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is less, and

b) smaller magnitudes of acceleration are required.

The series of feedback-gain matrices is identical to that for Run 1.
Run 3 shows the effect of decreasing the time for pursuit. The
input data and the terminal performance data are given below. The
resulting trajectories are shown in Figure 5.

Input Data

Identical to Run 1 except: N = 15.

Terminal Performance Data

    z(N) = [ (illegible), 42.5010, (illegible), 2.5000 ]^T
    CostR = (illegible)
    CostQ = (illegible)
    J     = (illegible)

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is larger, and

b) larger magnitudes of acceleration are required.

The series of feedback-gain matrices is identical to that for Run 1.
Run 4 shows the effect of reducing the magnitude of the elements
of the final-state weighting matrix, Q. The input data and the terminal
performance data are given below. The resulting trajectories are shown
in Figure 6.

Input Data

Identical to Run 1 except:

$$Q = \begin{bmatrix} 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ -.1874, 29.9950, -.0937, 2.4977 ]^T
    CostR = (illegible)
    CostQ = (illegible)
    J     = 23.430

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is larger, and

b) the required acceleration magnitudes are smaller.

[Figure 6. Intercept Trajectories for Reduced Final State Error
Weighting.]
Run 5 shows the effect of making the final-state weighting
matrix, Q, an identity matrix. The input data and the terminal
performance data are given below. The resulting trajectories are shown in
Figure 7.

Input Data

Identical to Run 1 except:

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ -.9440, (illegible), (illegible), .9004 (sign illegible) ]^T
    CostR = 168.20
    CostQ = 36.59
    J     = 204.29

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is larger,

b) the terminal velocity difference vector is smaller, and

c) the required acceleration for this strategy is no longer always
parallel to the projected terminal "miss-distance" vector.

The use of this particular Q matrix defines both position and
velocity as "states-of-interest" for the performance index.
Comparing the terminal performance data and trajectories for this run with
those for Run 1 shows a resulting "rendezvous" instead of an
"intercept".

[Figure 7. Uncooperative Rendezvous Trajectories. X-scale and Y-scale:
1.00E+02 units/inch.]


Run 6 shows the effect of decreasing the magnitude of the elements
of the evasion control weighting matrix, $R_e$. The input data and the
terminal performance data are given below. The resulting trajectories
are shown in Figure 8.

Input Data

Identical to Run 1 except:

$$R_e = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ (illegible), 30.0010, -.0563, 2.5005 ]^T
    CostR = 28.111
    CostQ = .016
    J     = 28.127

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is larger, and

b) the required acceleration for the evader is larger.

The physical implication of this reduction in the magnitude of
the elements of the $R_e$ matrix is to make the evader more
maneuverable.

[Figure 8.]
Run 7 shows the effect of increasing the magnitude of the elements
of the pursuit control weighting matrix, $R_p$. The input data and the
terminal performance data are given below. The resulting trajectories
are shown in Figure 9.

Input Data

Identical to Run 1 except:

$$R_p = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ -.2498, 29.9910, -.0469, 2.4977 ]^T
    CostR = 62.383
    CostQ = .078
    J     = 62.461

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is larger, and

b) the required acceleration for the pursuer is less.

The physical implication of increasing the magnitude of the
elements of the $R_p$ matrix is to make the pursuer less maneuverable.
Run 8 shows the effect of decreasing the sampled-data period
without changing the final time. The input data and the terminal
performance data are given below. The resulting trajectories and the
time variation of the (1,1) elements of the feedback-gain matrices
are shown in Figures 10 and 11.

Input Data

Identical to Run 1 except:

$$T = \tfrac{1}{2}, \qquad N = 40, \qquad R_p = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix}, \qquad R_e = \begin{bmatrix} 2.5 & 0 \\ 0 & 2.5 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ -.0937, 29.9950, -.0469, 2.4997 ]^T
    CostR = 23.419
    CostQ = .016
    J     = (illegible)

The resulting trajectories differ from the standard of Run 1 in
that:

a) the terminal "miss-distance" is slightly less, and

b) the acceleration required for both plants is slightly less.

The trajectories for this run are almost identical to those for
Run 1, showing only slight improvement in performance for the smaller
sample period. This demonstrates the ability to use this method with
small sample periods to approximate the solution for a system with
continuous controls and time varying feedback-gain matrices.

[Figure 11. Time Variation of the (1,1) Elements of the Feedback-Gain
Matrices, Reduced Sample Period Intercept.]
Run 9 shows the effect of reducing the magnitude of the elements
of the evasion control weighting matrix, $R_e$, enough to violate the
necessary conditions for the existence of the required saddle point.
The input data and the terminal performance data are given below. The
resulting trajectories are shown in Figure 12.

Input Data

Identical to Run 1 except:

$$R_e = \begin{bmatrix} 0.4 & 0 \\ 0 & 0.4 \end{bmatrix}$$

Terminal Performance Data

    z(N) = [ .0500, 30.0132, .0250, (illegible) ]^T
    CostR = -12.514
    CostQ = .003
    J     = (illegible)

The test for the necessary conditions for the saddle point failed
on the second iteration within subroutine OPTCON2.

The effect of the failure to satisfy the necessary conditions for
the saddle point is easily seen in the trajectories of Figure 12. In
this case the pursuer runs from the evader, and the evader pursues
the pursuer. Further study is indicated regarding violations of the
necessary conditions for the saddle point.

[Figure 12. Intercept Trajectories Illustrating Violation of Necessary
Conditions.]
Run 10 is made to determine the point at which decreasing the
magnitude of the elements of the evasion control weighting matrix,
$R_e$, causes failure of the test for the required saddle point. No
trajectories are simulated. The strategy was computed for input data
identical to that for Run 1 except for the values of the $R_e$ matrix,
which are given below. The existence of the saddle point was tested
for 100 sample periods, and the results are as follows:

    $R_e$ = diag(2.7, 2.7): saddle point exists for N ≤ 100;
    $R_e$ = (values illegible): saddle point exists for N ≤ 100;
    $R_e$ = diag(2.5, 2.5): saddle point does not exist for N ≥ 3;
    $R_e$ = diag(0.4, 0.4): saddle point does not exist for N ≥ 2.

This run shows that the saddle point exists for the plants specified
for this series of tests, with Q and $R_p$ as specified, for the elements
of the $R_e$ matrix greater than 2.5.
5.2 Comparisons of Optimal and Sub-optimal Strategies

Run 11 shows the effect of the evader abandoning the optimal
strategy by not maneuvering at all. The input data is identical to
that for Run 1. A statement was added in the program to require:

$$v = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$ for trajectory simulation.

The pursuer followed the optimal strategy (assuming the evader would
do likewise). The resulting trajectories are shown in Figure 13.

Terminal Performance Data

    z(N) = [ -.0048, (illegible), -.0024, 1.7527 ]^T
    CostR = (illegible)
    CostQ = .000
    J     = 19.708

Comparison of the results of this test with those of Run 1 shows
that (at least for this case) the left-hand inequality of equation (13)
does indeed hold. Intuitively, the pursuer does "better" if the
evader does not use his "intelligence". The series of feedback-
gain matrices is identical to that for Run 1.
Run 12 shows the effect of "improving" the pursuer's knowledge
of the evader. The input data and the terminal performance data are
given below. The resulting trajectories are shown in Figure 14.

Input Data

Identical to Run 1 and Run 11 except:

$$B_e = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$

(In this case the statement v = 0 is not required
to prevent the evader from maneuvering.)

Terminal Performance Data

    z(N) = [ -.0750, 30.0041, -.0375, (illegible) ]^T
    CostR = 18.748
    CostQ = .007
    J     = 18.755

Comparison of the results of this run with those of Run 11
shows that the pursuer can indeed do "better" if he has (or uses)
"better intelligence". Note that in this case "better" means a lower
value of the performance index, and not necessarily a smaller
"miss-distance".

[Figure 14.]
Run 13 shows the effect of the pursuer using the strategy of
"proportional navigation" against the same non-maneuvering evader.
The input data is identical with that for Run 1 and Run 11. The evader
is again prevented from maneuvering by inserting a statement
requiring v = 0 for the trajectory simulation as in Run 11. Additional
statements are added to require the pursuer's speed to remain
constant, and his rate of turn to be five times the evader's bearing rate
(as seen from the pursuer). The resulting trajectories are shown in
Figure 15.

Terminal Performance Data

    z(N) = [ -269.53, (illegible), -68.31, -3.34 ]^T
    CostR = 3.975 (exponent, if any, illegible)
    CostQ = 7.7 x 10^4 (remaining digits illegible)
    J     = (leading digit illegible).873 x 10^4

The resulting trajectories show vividly the common pitfall of
trying to compare two strategies based on different assumptions.
Here the fault is in attempting to compare the derived strategy, which
allows acceleration in any direction and requires intercept at a
specified time, with the strategy of proportional navigation, which
assumes the pursuer and evader are restricted to constant speeds (the
pursuer faster than the evader) and allows intercept at any time.
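
The turn-rate rule used for the proportional navigation comparison
can be illustrated with a short sketch. It is not taken from the
thesis program: the function name `pn_turn_rate`, the definition of
bearing as the angle measured from north toward east, and the use of
evader-minus-pursuer relative quantities are all assumptions made
here.

```python
def pn_turn_rate(rel_north, rel_east, rel_vel_north, rel_vel_east,
                 nav_constant=5.0):
    """Commanded turn rate = nav_constant times the bearing rate.

    The relative position and velocity are those of the evader with
    respect to the pursuer. Bearing is atan2(east, north), so its time
    derivative is (n * e_dot - e * n_dot) / (n^2 + e^2).
    """
    range_sq = rel_north ** 2 + rel_east ** 2
    bearing_rate = (rel_north * rel_vel_east
                    - rel_east * rel_vel_north) / range_sq
    return nav_constant * bearing_rate
```

With the default constant of five, a bearing that is not changing
(for example a head-on closing geometry) commands no turn at all,
which is why this rule presumes a speed advantage rather than a fixed
intercept time.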
[Figure 15.]
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0.3 Application to the One-sided Control Problem 


Run 21 simulates the resulting trajectory of a single system, using
evader dynamics to specify the state of the pursuit system desired at
the final time. The state vector for the system is defined:


\[
x = \begin{bmatrix}
\text{position} \\
\text{velocity} \\
\text{acceleration} \\
\text{acceleration rate}
\end{bmatrix}
\tag{52}
\]

The dynamics for the pursuit plant are those of an arbitrary, stable,
fourth-order system. The dynamics of the evasion plant maintain
x_e constant. The input data and numerical results are given below.
The resulting trajectory is shown in Figures 16 and 17. The time
variation of the (1,1) element of the feedback-gain matrix is shown in
Figure 18.

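Input Data

(Several entries are illegible in the source; the complete input data
are given in Appendix B.)

Pursuit plant dynamics matrix:

     0             1.0           0             0
     0             0             1.0           0
     0             0             0             1.0
     [illegible]   -0.69454      [illegible]   -1.75930

Second 4 x 4 matrix: [illegible]

Third 4 x 4 matrix (label illegible):

     1000      0         0         0
     0         1000      0         0
     0         0         1000      0
     0         0         0         1000

Initial states:

     x_p(0) = [ 100, [sign illegible] 10, [illegible], 0 ]
     x_e(0) = [ 40, 20, 10, 0 ]

N = 50; the remaining entry on this line is illegible.
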
Numerical Results

z(N) = [ [illegible], [illegible], [partly illegible: "7 3301"], -1.4372 ]

CostR = 12816
CostQ = 2048
J = 14864


The results of this run demonstrate that the strategy may be
successfully used to cause a plant to approach a desired state at a
given time, and that the computer program is capable of simulating
the results of this application.
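
The one-sided problem of driving a plant toward a desired state at a
fixed final time can be illustrated with the standard finite-horizon
linear-quadratic recursion, computed backward by dynamic programming.
The sketch below is not the thesis program and does not use the Run 21
data. It assumes a hypothetical double-integrator plant with a scalar
control, a desired final state at the origin of the deviation z, and a
cost J = CostR + CostQ with CostR the sum of r u(k)^2 and CostQ equal
to z(N)'Q z(N) (no factor of one half). All function names and numbers
are illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def quad_form(m, z):
    """z' M z for a vector z stored as a flat list."""
    n = len(z)
    return sum(z[i] * m[i][j] * z[j] for i in range(n) for j in range(n))


def terminal_gains(phi, gamma, q_term, r, n_steps):
    """Backward sweep: gains K(0..N-1) and cost matrix P(0), scalar input."""
    n = len(phi)
    p = q_term                                         # P(N) = Q
    gains = [None] * n_steps
    for k in reversed(range(n_steps)):
        gt_p = matmul(transpose(gamma), p)             # Gamma' P(k+1), 1 x n
        denom = r + matmul(gt_p, gamma)[0][0]          # r + Gamma' P Gamma
        gain = [v / denom for v in matmul(gt_p, phi)[0]]
        closed = [[phi[i][j] - gamma[i][0] * gain[j] for j in range(n)]
                  for i in range(n)]                   # Phi - Gamma K(k)
        p = matmul(matmul(transpose(phi), p), closed)  # P(k)
        gains[k] = gain
    return gains, p


def run_example():
    """Hypothetical double integrator, sampling period 0.1, 50 stages."""
    t = 0.1
    phi = [[1.0, t], [0.0, 1.0]]
    gamma = [[0.5 * t * t], [t]]
    q_term = [[1000.0, 0.0], [0.0, 1000.0]]
    r, n_steps = 1.0, 50
    z0 = [10.0, 0.0]

    gains, p0 = terminal_gains(phi, gamma, q_term, r, n_steps)
    z, cost_r = list(z0), 0.0
    for gain in gains:
        u = -sum(g * zi for g, zi in zip(gain, z))     # u(k) = -K(k) z(k)
        cost_r += r * u * u
        z = [sum(phi[i][j] * z[j] for j in range(2)) + gamma[i][0] * u
             for i in range(2)]
    cost_q = quad_form(q_term, z)
    return z, cost_r, cost_q, quad_form(p0, z0)
```

run_example() returns the final deviation, CostR, CostQ, and the cost
z(0)'P(0)z(0) predicted by the backward sweep. Under these assumptions
the simulated CostR + CostQ agrees with the predicted value, which
gives a simple consistency check on the recursion.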

The complete input data, showing required format, and the computer
printout of the numerical results of this run are given in Appendix
B. The changes made in the main program to provide a useful form
of graphical computer output for this run are shown in Appendix C.
Figure 16

Position-Velocity-Acceleration vs. Time for Single Plant

Figure 17

Position vs. Velocity Phase Trajectory for Single Plant

Figure 18

Time Variation of the (1,1) Element of the Feedback-gain Matrix
for Single Plant
6. CONCLUSIONS

A class of pursuit-evasion games has been solved, illustrating
the application of differential game theory and dynamic programming 
to control system design. A computer program has been provided to 
simulate the response of linear systems to controls based on the 
resulting strategy. A set of necessary and sufficient conditions for 
the existence of the solution has been found. Attempts to simplify 
these conditions and to provide an adequate physical interpretation 
of them have been unsuccessful. 

That this strategy successfully generates feedback controls for 
pursuit and evasion plants, for which the system dynamics are linear, 
and linearly related, and for which some form of "intercept" or 
"rendezvous" is desired, has been demonstrated. A certain general- 
ity of application of the strategy is inherent in the ability to choose 
different plants for the pursuer and evader, the ability to choose 
different combinations of weighting matrices in the performance 
index, and the ability to handle arbitrary initial conditions. If the 
plant dynamics and performance index are stated beforehand, the
[symbols illegible] and B matrices may be computed and tabulated, making
the "on-line" implementation of the strategy mathematically simple 
and rapid. 
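
The split between off-line tabulation and on-line use can be sketched
as follows. This is a minimal illustration under stated assumptions,
not the thesis program: it uses a hypothetical scalar plant
z(k+1) = a z(k) + b u(k) with a terminal weight q and control weight
r, and the names GainSchedule, tabulate_gains, and run_online are
invented for the example. The gains are computed once by a backward
sweep; each on-line step is then a table lookup and a multiplication.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GainSchedule:
    """Feedback gains tabulated off-line, one per sampling instant."""
    gains: List[float]

    def control(self, k: int, z: float) -> float:
        # On-line work: one table lookup and one multiplication.
        return -self.gains[k] * z


def tabulate_gains(a: float, b: float, q: float, r: float,
                   n_steps: int) -> GainSchedule:
    """Off-line backward sweep for the scalar plant z(k+1) = a z(k) + b u(k)."""
    p = q                                   # cost-to-go weight at the final stage
    gains = [0.0] * n_steps
    for k in reversed(range(n_steps)):
        gains[k] = b * p * a / (r + b * b * p)
        p = a * p * (a - b * gains[k])
    return GainSchedule(gains)


def run_online(schedule: GainSchedule, a: float, b: float, z0: float) -> float:
    """Apply the tabulated gains stage by stage and return the final state."""
    z = z0
    for k in range(len(schedule.gains)):
        z = a * z + b * schedule.control(k, z)
    return z
```

For example, tabulate_gains(1.0, 0.1, 1000.0, 1.0, 50) builds a
50-entry table, and run_online(schedule, 1.0, 0.1, 10.0) then drives
the assumed initial state of 10 close to zero at the final stage.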

The requirement for a linear relationship between the dynamics of 
the two plants is restrictive. For example, a linear representation of 
the dynamics of one aircraft about its roll, pitch, and yaw axes is 
quite realistic [7]. However, the dynamics of two maneuvering
aircraft expressed about their respective axes are not linearly related.

A limitation is also imposed on the desired system response by the
form of the performance index. For example, an aircraft commander
who finds himself on a collision course with an intercepting missile
(z(k)'Qz(k) = 0) may find the application of this strategy, which
tells him to conserve his fuel (v(k) = P z(k) = 0), very unsatisfying.



An obvious area for future investigation is the extension of this 
strategy to include the broader class of pursuit-evasion problems 
having a performance index of the quadratic form: 


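\[
\begin{aligned}
J ={}& [x_p(t_f) - x_e(t_f)]' \, Q \, [x_p(t_f) - x_e(t_f)] \\
     & + \int_{t_0}^{t_f} \Big\{ [x_p(t) - x_e(t)]' \, S \, [x_p(t) - x_e(t)]
       + u(t)' R_p u(t) - v(t)' R_e v(t) \Big\} \, dt
\end{aligned}
\tag{53}
\]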


Simplification of the necessary and sufficient conditions for
the existence of a saddle point and the physical interpretation of
these conditions is another area for future investigation. Ho
intimates that these conditions are related to the "relative
controllability" of the two plants [3], which suggests a method of
attack and the possibility of gaining additional insight into the
design of pursuit-evasion systems in general.